## Reading layer `studentsData' from data source `/Users/julianazhou/Documents/GitHub/Miami/studentsData.geojson' using driver `GeoJSON'
## Simple feature collection with 3503 features and 58 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -80.31756 ymin: 25.70911 xmax: -80.12061 ymax: 25.87091
## geographic CRS: WGS 84
## Reading layer `Municipality_poly' from data source `https://opendata.arcgis.com/datasets/5ece0745e24b4617a49f2e098df8117f_0.geojson' using driver `GeoJSON'
## Simple feature collection with 78 features and 11 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -80.87358 ymin: 25.13742 xmax: -80.04276 ymax: 25.97944
## geographic CRS: WGS 84
## Reading layer `Shoreline' from data source `https://opendata.arcgis.com/datasets/58386199cc234518822e5f34f65eb713_0.geojson' using driver `GeoJSON'
## Simple feature collection with 1628 features and 6 fields
## geometry type:  MULTILINESTRING
## dimension:      XY
## bbox:           xmin: -80.85817 ymin: 25.00041 xmax: -80.11807 ymax: 25.97517
## geographic CRS: WGS 84
## Reading layer `MiddleAttendanceBoundary' from data source `https://opendata.arcgis.com/datasets/dd2719ff6105463187197165a9c8dd5c_0.geojson' using driver `GeoJSON'
## Simple feature collection with 54 features and 17 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -80.67463 ymin: 25.37431 xmax: -80.11807 ymax: 25.97515
## geographic CRS: WGS 84
# Reg1: Shore1, demographic variables, and the .MS indicator columns only.
Reg1 <- lm(miamiHomes.train$SalePrice ~ ., data = miamiHomes.train %>%
             st_drop_geometry() %>%
             dplyr::select(Shore1, MedHHInc, TotalPop, MedRent, pctWhite, pctPoverty,
                           Brownsville.MS, CitrusGrove.MS, JosedeDiego.MS, GeorgiaJA.MS,
                           KinlochPk.MS, Madison.MS, Nautilus.MS, Shenandoah.MS, WestMiami.MS))

# Reg2: every remaining column except the identifiers and the toPredict flag.
Reg2 <- lm(miamiHomes.train$SalePrice ~ ., data = miamiHomes.train %>%
             st_drop_geometry() %>%
             dplyr::select(-GEOID, -ID, -toPredict))

summary(Reg1)
## 
## Call:
## lm(formula = miamiHomes.train$SalePrice ~ ., data = miamiHomes.train %>% 
##     st_drop_geometry() %>% dplyr::select(Shore1, MedHHInc, TotalPop, 
##     MedRent, pctWhite, pctPoverty, Brownsville.MS, CitrusGrove.MS, 
##     JosedeDiego.MS, GeorgiaJA.MS, KinlochPk.MS, Madison.MS, Nautilus.MS, 
##     Shenandoah.MS, WestMiami.MS, ))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -505816  -63281   -6461   49079  692032 
## 
## Coefficients: (1 not defined because of singularities)
##                    Estimate   Std. Error t value      Pr(>|t|)    
## (Intercept)     274559.3516   50967.1119   5.387 0.00000008255 ***
## Shore1              -5.7452       1.2288  -4.676 0.00000318341 ***
## MedHHInc             1.3698       0.2344   5.844 0.00000000620 ***
## TotalPop             4.4526       1.7971   2.478       0.01333 *  
## MedRent              8.0105      16.9273   0.473       0.63611    
## pctWhite         87738.7535   21213.9059   4.136 0.00003722772 ***
## pctPoverty      -65160.5762   46955.0564  -1.388       0.16542    
## Brownsville.MS  -74321.8984   35665.2339  -2.084       0.03733 *  
## CitrusGrove.MS  -63085.3758   33757.2289  -1.869       0.06184 .  
## JosedeDiego.MS  -24574.0248   36087.1737  -0.681       0.49600    
## GeorgiaJA.MS    -83659.5425   33968.4429  -2.463       0.01389 *  
## KinlochPk.MS    -20898.1255   23871.5518  -0.875       0.38147    
## Madison.MS     -102752.0450   89066.6696  -1.154       0.24882    
## Nautilus.MS     229234.0043   37774.8118   6.068 0.00000000162 ***
## Shenandoah.MS    99482.5601   31097.1370   3.199       0.00141 ** 
## WestMiami.MS             NA           NA      NA            NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 113700 on 1569 degrees of freedom
##   (482 observations deleted due to missingness)
## Multiple R-squared:  0.6029, Adjusted R-squared:  0.5994 
## F-statistic: 170.2 on 14 and 1569 DF,  p-value: < 0.00000000000000022
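The `NA` row for `WestMiami.MS` and the note "1 not defined because of singularities" mean that `lm()` found that column to be an exact linear combination of the other predictors. One common cause, which may be what is happening here, is including an indicator for every category alongside an intercept. The sketch below reproduces that situation on made-up data; the `toy` data frame and its `A.MS`, `B.MS`, `C.MS` columns are hypothetical and are not taken from `miamiHomes.train`.

```r
# Toy data: three mutually exclusive zone indicators that sum to 1 in every row.
set.seed(1)
n <- 90
zone <- rep(c("A", "B", "C"), each = n / 3)
toy <- data.frame(
  price = 100 + 20 * (zone == "B") + 50 * (zone == "C") + rnorm(n, sd = 5),
  A.MS  = as.numeric(zone == "A"),
  B.MS  = as.numeric(zone == "B"),
  C.MS  = as.numeric(zone == "C")
)

# A.MS + B.MS + C.MS equals the intercept column, so one coefficient
# cannot be estimated; lm() reports the last dependent column as NA.
fit_all <- lm(price ~ A.MS + B.MS + C.MS, data = toy)
print(coef(fit_all))

# Dropping one indicator (here A.MS) makes it the reference category.
fit_ref <- lm(price ~ B.MS + C.MS, data = toy)
print(coef(fit_ref))
```

Both fits give the same fitted values; only the parameterization differs. Under this reading, the remaining `.MS` coefficients in the summary would be interpreted relative to the omitted category.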
summary(Reg2)
## 
## Call:
## lm(formula = miamiHomes.train$SalePrice ~ ., data = miamiHomes.train %>% 
##     st_drop_geometry() %>% dplyr::select(-GEOID, -ID, -toPredict))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -633041  -52563    -324   44778  716398 
## 
## Coefficients: (2 not defined because of singularities)
##                                   Estimate        Std. Error t value
## (Intercept)               -5246.4350534539 142697.2837201184  -0.037
## Folio                         0.0000000852      0.0000010166   0.084
## Property.CityMiami Beach 220589.4402736548 102729.5109330711   2.147
## LotSize                      17.9744511198      1.6598697438  10.829
## Bed                        8653.6103385339   4483.3273688662   1.930
## Bath                       4613.3433411990   5440.1497794038   0.848
## Stories                   13854.9186849976  11214.3749333794   1.235
## Pool                      77281.6482034501   9820.8922735323   7.869
## Fence                      -149.0504182702   5646.3493174582  -0.026
## Patio                      4073.1199787392   5115.9393321635   0.796
## ActualSqFt                   67.0325323507      6.2306983088  10.758
## Age                        -698.9747848495    147.1481424575  -4.750
## Shore1                       -3.3692430543      1.0299999168  -3.271
## MedHHInc                      1.0907134061      0.1933144376   5.642
## TotalPop                      3.3995118427      1.4813680114   2.295
## MedRent                      10.1107571325     13.9704575462   0.724
## pctWhite                  72584.5891876008  18038.4043269414   4.024
## pctPoverty               -28329.3231192995  38669.4205943405  -0.733
## Brownsville.MS           -19595.1411441002  29848.9697522498  -0.656
## CitrusGrove.MS           -16536.9740318180  27908.0093871838  -0.593
## JosedeDiego.MS            41689.6871928630  30171.4322819578   1.382
## GeorgiaJA.MS             -21745.7906222292  28569.9668882082  -0.761
## KinlochPk.MS               7151.5470039981  19733.8734794334   0.362
## Madison.MS               -24901.9379151104  73254.8959734925  -0.340
## Nautilus.MS                             NA                NA      NA
## Shenandoah.MS            122292.1069263248  25991.1247964894   4.705
## WestMiami.MS                            NA                NA      NA
##                                      Pr(>|t|)    
## (Intercept)                           0.97068    
## Folio                                 0.93322    
## Property.CityMiami Beach              0.03192 *  
## LotSize                  < 0.0000000000000002 ***
## Bed                                   0.05377 .  
## Bath                                  0.39656    
## Stories                               0.21685    
## Pool                      0.00000000000000663 ***
## Fence                                 0.97894    
## Patio                                 0.42606    
## ActualSqFt               < 0.0000000000000002 ***
## Age                       0.00000221955735896 ***
## Shore1                                0.00109 ** 
## MedHHInc                  0.00000001990953696 ***
## TotalPop                              0.02187 *  
## MedRent                               0.46934    
## pctWhite                  0.00005998767470287 ***
## pctPoverty                            0.46391    
## Brownsville.MS                        0.51161    
## CitrusGrove.MS                        0.55357    
## JosedeDiego.MS                        0.16724    
## GeorgiaJA.MS                          0.44669    
## KinlochPk.MS                          0.71710    
## Madison.MS                            0.73395    
## Nautilus.MS                                NA    
## Shenandoah.MS             0.00000276128964864 ***
## WestMiami.MS                               NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 93070 on 1559 degrees of freedom
##   (482 observations deleted due to missingness)
## Multiple R-squared:  0.7358, Adjusted R-squared:  0.7318 
## F-statistic: 180.9 on 24 and 1559 DF,  p-value: < 0.00000000000000022
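The fit statistics in the two summaries can be cross-checked from the printed R-squared values and degrees of freedom alone. The helper below, `fit_stats()`, is a hypothetical name introduced for this sketch; it uses the standard formulas for adjusted R-squared and the overall F-statistic. Because the inputs are the rounded values shown above, the results agree with the printed output only to about three or four significant digits.

```r
# r2: multiple R-squared, p: number of estimated slopes (numerator DF),
# df_resid: residual degrees of freedom (denominator DF).
fit_stats <- function(r2, p, df_resid) {
  n <- df_resid + p + 1                       # observations actually used
  c(n      = n,
    adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid,
    f_stat = (r2 / p) / ((1 - r2) / df_resid))
}

reg1 <- fit_stats(r2 = 0.6029, p = 14, df_resid = 1569)
reg2 <- fit_stats(r2 = 0.7358, p = 24, df_resid = 1559)
print(round(rbind(Reg1 = reg1, Reg2 = reg2), 4))
```

Both models recover 1584 observations, which matches the "Observations" line reported by `summ(Reg1)`, so the two regressions are fitted on the same rows and their R-squared values are directly comparable.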
summ(Reg1)
Observations         1584 (482 missing obs. deleted)
Dependent variable   miamiHomes.train$SalePrice
Type                 OLS linear regression

F(14,1569)   170.16
R²           0.60
Adj. R²      0.60

                       Est.       S.E.   t val.      p
(Intercept)       274559.35   50967.11     5.39   0.00
Shore1                -5.75       1.23    -4.68   0.00
MedHHInc               1.37       0.23     5.84   0.00
TotalPop               4.45       1.80     2.48   0.01
MedRent                8.01      16.93     0.47   0.64
pctWhite           87738.75   21213.91     4.14   0.00
pctPoverty        -65160.58   46955.06    -1.39   0.17
Brownsville.MS    -74321.90   35665.23    -2.08   0.04
CitrusGrove.MS    -63085.38   33757.23    -1.87   0.06
JosedeDiego.MS    -24574.02   36087.17    -0.68   0.50
GeorgiaJA.MS      -83659.54   33968.44    -2.46   0.01
KinlochPk.MS      -20898.13   23871.55    -0.88   0.38
Madison.MS       -102752.05   89066.67    -1.15   0.25
Nautilus.MS       229234.00   37774.81     6.07   0.00
Shenandoah.MS      99482.56   31097.14     3.20   0.00
WestMiami.MS             NA         NA       NA     NA

Standard errors: OLS
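The `p` column above is rounded to two decimals, so several rows read `0.00`. As a rough illustration of how the `t val.` and `p` entries relate to the estimate and standard error, the sketch below recomputes them for the `Shore1` row from the unrounded values printed by `summary(Reg1)`. It assumes the usual two-sided t-test on the residual degrees of freedom (1569).

```r
# Values copied from the Shore1 row of summary(Reg1).
est      <- -5.7452
se       <- 1.2288
df_resid <- 1569

# t statistic is the estimate divided by its standard error.
t_val <- est / se

# Two-sided p-value from the t distribution with df_resid degrees of freedom.
p_val <- 2 * pt(abs(t_val), df = df_resid, lower.tail = FALSE)

print(round(t_val, 3))
print(format(p_val, scientific = TRUE, digits = 3))
```

The p-value is on the order of 1e-06, which is why it displays as `0.00` after rounding to two decimals.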
stargazer(Reg1, Reg2, title="Training Set LM Results", align=TRUE, type = "html", out = "Regression_Results.htm")
## 
## <table style="text-align:center"><caption><strong>Training Set LM Results</strong></caption>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td colspan="2"><em>Dependent variable:</em></td></tr>
## <tr><td></td><td colspan="2" style="border-bottom: 1px solid black"></td></tr>
## <tr><td style="text-align:left"></td><td colspan="2">SalePrice</td></tr>
## <tr><td style="text-align:left"></td><td>(1)</td><td>(2)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Folio</td><td></td><td>0.00000</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(0.00000)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Property.CityMiami Beach</td><td></td><td>220,589.400<sup>**</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(102,729.500)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">LotSize</td><td></td><td>17.974<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(1.660)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Bed</td><td></td><td>8,653.610<sup>*</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(4,483.327)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Bath</td><td></td><td>4,613.343</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(5,440.150)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Stories</td><td></td><td>13,854.920</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(11,214.380)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Pool</td><td></td><td>77,281.650<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(9,820.892)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Fence</td><td></td><td>-149.050</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(5,646.349)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Patio</td><td></td><td>4,073.120</td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(5,115.939)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">ActualSqFt</td><td></td><td>67.033<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(6.231)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Age</td><td></td><td>-698.975<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td></td><td>(147.148)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Shore1</td><td>-5.745<sup>***</sup></td><td>-3.369<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(1.229)</td><td>(1.030)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">MedHHInc</td><td>1.370<sup>***</sup></td><td>1.091<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(0.234)</td><td>(0.193)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">TotalPop</td><td>4.453<sup>**</sup></td><td>3.400<sup>**</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(1.797)</td><td>(1.481)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">MedRent</td><td>8.011</td><td>10.111</td></tr>
## <tr><td style="text-align:left"></td><td>(16.927)</td><td>(13.970)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">pctWhite</td><td>87,738.750<sup>***</sup></td><td>72,584.590<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(21,213.910)</td><td>(18,038.400)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">pctPoverty</td><td>-65,160.580</td><td>-28,329.320</td></tr>
## <tr><td style="text-align:left"></td><td>(46,955.060)</td><td>(38,669.420)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Brownsville.MS</td><td>-74,321.900<sup>**</sup></td><td>-19,595.140</td></tr>
## <tr><td style="text-align:left"></td><td>(35,665.230)</td><td>(29,848.970)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">CitrusGrove.MS</td><td>-63,085.380<sup>*</sup></td><td>-16,536.970</td></tr>
## <tr><td style="text-align:left"></td><td>(33,757.230)</td><td>(27,908.010)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">JosedeDiego.MS</td><td>-24,574.030</td><td>41,689.690</td></tr>
## <tr><td style="text-align:left"></td><td>(36,087.170)</td><td>(30,171.430)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">GeorgiaJA.MS</td><td>-83,659.540<sup>**</sup></td><td>-21,745.790</td></tr>
## <tr><td style="text-align:left"></td><td>(33,968.440)</td><td>(28,569.970)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">KinlochPk.MS</td><td>-20,898.120</td><td>7,151.547</td></tr>
## <tr><td style="text-align:left"></td><td>(23,871.550)</td><td>(19,733.870)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Madison.MS</td><td>-102,752.000</td><td>-24,901.940</td></tr>
## <tr><td style="text-align:left"></td><td>(89,066.670)</td><td>(73,254.900)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Nautilus.MS</td><td>229,234.000<sup>***</sup></td><td></td></tr>
## <tr><td style="text-align:left"></td><td>(37,774.810)</td><td></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Shenandoah.MS</td><td>99,482.560<sup>***</sup></td><td>122,292.100<sup>***</sup></td></tr>
## <tr><td style="text-align:left"></td><td>(31,097.140)</td><td>(25,991.120)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">WestMiami.MS</td><td></td><td></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td style="text-align:left">Constant</td><td>274,559.400<sup>***</sup></td><td>-5,246.435</td></tr>
## <tr><td style="text-align:left"></td><td>(50,967.110)</td><td>(142,697.300)</td></tr>
## <tr><td style="text-align:left"></td><td></td><td></td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Observations</td><td>1,584</td><td>1,584</td></tr>
## <tr><td style="text-align:left">R<sup>2</sup></td><td>0.603</td><td>0.736</td></tr>
## <tr><td style="text-align:left">Adjusted R<sup>2</sup></td><td>0.599</td><td>0.732</td></tr>
## <tr><td style="text-align:left">Residual Std. Error</td><td>113,746.600 (df = 1569)</td><td>93,070.580 (df = 1559)</td></tr>
## <tr><td style="text-align:left">F Statistic</td><td>170.158<sup>***</sup> (df = 14; 1569)</td><td>180.949<sup>***</sup> (df = 24; 1559)</td></tr>
## <tr><td colspan="3" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"><em>Note:</em></td><td colspan="2" style="text-align:right"><sup>*</sup>p<0.1; <sup>**</sup>p<0.05; <sup>***</sup>p<0.01</td></tr>
## </table>
# R2 by location variable: GEOID = 0.3, MailingZip = 0.4, PropertyZip = 0.9
## 
## Call:
## lm(formula = SalePrice ~ ., data = miami.training %>% st_drop_geometry() %>% 
##     dplyr::select(-GEOID, -ID, -toPredict))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -302587  -51571   -2064   44957  614688 
## 
## Coefficients: (2 not defined because of singularities)
##                                  Estimate       Std. Error t value
## (Intercept)               78136.837213698 176059.192638503   0.444
## Folio                        -0.000000385      0.000001255  -0.307
## Property.CityMiami Beach 276454.999110661 127451.105147057   2.169
## LotSize                      16.317702062      2.038776726   8.004
## Bed                        5988.517104382   5541.684530678   1.081
## Bath                       5341.577560724   6854.121820066   0.779
## Stories                   20516.913244497  13971.424146060   1.468
## Pool                      76118.434570713  12024.746850741   6.330
## Fence                     -5153.231757390   7177.774007681  -0.718
## Patio                      2589.646364106   6402.120172277   0.404
## ActualSqFt                   69.342553528      8.107934838   8.552
## Age                        -682.246009530    189.026082349  -3.609
## Shore1                       -4.093915075      1.292472764  -3.168
## MedHHInc                      1.254048187      0.244696272   5.125
## TotalPop                      4.977026127      1.842105838   2.702
## MedRent                      -1.208459171     18.291532168  -0.066
## pctWhite                  52237.978936399  23119.019761861   2.260
## pctPoverty               -26272.643845086  48485.282085386  -0.542
## Brownsville.MS           -17771.098702723  37362.287052672  -0.476
## CitrusGrove.MS           -19914.470114837  34995.870977921  -0.569
## JosedeDiego.MS            34037.919499548  37607.403497168   0.905
## GeorgiaJA.MS             -25082.785286265  35628.510515297  -0.704
## KinlochPk.MS              23587.524677210  25417.575417248   0.928
## Madison.MS               -62121.192559062  97745.969841958  -0.636
## Nautilus.MS                            NA               NA      NA
## Shenandoah.MS            134605.350370903  32631.951861755   4.125
## WestMiami.MS                           NA               NA      NA
##                                      Pr(>|t|)    
## (Intercept)                          0.657284    
## Folio                                0.759051    
## Property.CityMiami Beach             0.030329 *  
## LotSize                   0.00000000000000361 ***
## Bed                                  0.280143    
## Bath                                 0.435989    
## Stories                              0.142312    
## Pool                      0.00000000038172082 ***
## Fence                                0.472974    
## Patio                                0.685940    
## ActualSqFt               < 0.0000000000000002 ***
## Age                                  0.000324 ***
## Shore1                               0.001588 ** 
## MedHHInc                  0.00000036255562183 ***
## TotalPop                             0.007023 ** 
## MedRent                              0.947339    
## pctWhite                             0.024083 *  
## pctPoverty                           0.588040    
## Brownsville.MS                       0.634442    
## CitrusGrove.MS                       0.569459    
## JosedeDiego.MS                       0.365656    
## GeorgiaJA.MS                         0.481605    
## KinlochPk.MS                         0.353650    
## Madison.MS                           0.525236    
## Nautilus.MS                                NA    
## Shenandoah.MS             0.00004042762290486 ***
## WestMiami.MS                               NA    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 89360 on 923 degrees of freedom
##   (294 observations deleted due to missingness)
## Multiple R-squared:  0.7519, Adjusted R-squared:  0.7454 
## F-statistic: 116.5 on 24 and 923 DF,  p-value: < 0.00000000000000022
## Train MAE:  195610  
##  Test MAE:  61
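The MAE figures above summarise the average absolute miss of the model. As a minimal sketch of how such error metrics are computed, here are the conventional definitions of MAE and MAPE applied to made-up numbers. The vectors `observed` and `predicted` and the helpers `mae` and `mape` are illustrative names, not objects from the original script.

```r
# Toy observed and predicted sale prices (made-up values for illustration)
observed  <- c(200000, 350000, 500000, 150000)
predicted <- c(220000, 330000, 450000, 180000)

# Mean absolute error: average size of the miss, in dollars
mae <- function(obs, pred) mean(abs(pred - obs), na.rm = TRUE)

# Mean absolute percentage error: miss relative to the observed price
mape <- function(obs, pred) mean(abs(pred - obs) / obs, na.rm = TRUE)

mae(observed, predicted)
mape(observed, predicted)
```

MAE is in dollars, while MAPE is unitless, which makes it easier to compare error across cheap and expensive homes.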
fitControl <- trainControl(method = "cv", 
                           number = 100,
                           # savePredictions differs from book
                           savePredictions = TRUE)

set.seed(73506)
#No neighborhoods R2=0.788, MAE=525,277

reg.cv2 <- 
  train(SalePrice ~ ., data = miamiHomes.train %>%
          st_drop_geometry() %>%
          dplyr::select(-ActualSqFt, -TotalPop, -ID, 
                        -MedHHInc, -toPredict), 
        method = "lm", 
        trControl = fitControl, 
        na.action = na.omit)

reg.cv2
## Linear Regression 
## 
## 2066 samples
##   24 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (100 fold) 
## Summary of sample sizes: 1568, 1568, 1567, 1568, 1568, 1568, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   88416.28  0.7556264  65775.12
## 
## Tuning parameter 'intercept' was held constant at a value of TRUE
#Providing Results of Cross Validation Test
reg.cv2$results %>%
  knitr::kable()
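| intercept | RMSE | Rsquared | MAE | RMSESD | RsquaredSD | MAESD |
|:----------|---------:|----------:|---------:|---------:|-----------:|---------:|
| TRUE | 88416.28 | 0.7556264 | 65775.12 | 25971.98 | 0.1477249 | 16997.15 |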
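To make the MAE and MAESD columns concrete, here is a rough base R sketch of k-fold cross-validation done by hand. It uses the built-in `mtcars` data and an arbitrary two-predictor model purely as stand-ins, with 5 folds instead of 100; the assumption is that `caret` reports the mean and standard deviation of the per-fold metric in a similar way.

```r
set.seed(1)                        # arbitrary seed, only for reproducibility
k     <- 5                         # number of folds (the script above uses 100)
folds <- sample(rep(seq_len(k), length.out = nrow(mtcars)))

# Fit on k - 1 folds, predict the held-out fold, record its MAE
fold_mae <- sapply(seq_len(k), function(i) {
  train <- mtcars[folds != i, ]
  test  <- mtcars[folds == i, ]
  fit   <- lm(mpg ~ wt + hp, data = train)
  mean(abs(predict(fit, newdata = test) - test$mpg))
})

# Average error across folds and how much it varies from fold to fold
cv_summary <- c(MAE = mean(fold_mae), MAESD = sd(fold_mae))
cv_summary
```

A large MAESD relative to MAE suggests the model's accuracy depends heavily on which homes land in the hold-out fold.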
# Histogram of the MAE in each cross-validation fold
MAE_hist <- as.data.frame(reg.cv2$pred) %>%
  mutate(ABS_error = abs(pred - obs)) %>%
  group_by(Resample) %>%
  summarize(MAE = mean(ABS_error, na.rm = TRUE))

ggplot(MAE_hist, aes(x = MAE)) +
  geom_histogram() +
  labs(title = "Histogram of Cross Validation MAE",
       caption = "Based on 100 Folds") +
  theme_classic()

#Plotting Predicted Prices as a Function of Observed Prices

regCV2plot <- as.data.frame(reg.cv2$pred)


ggplot(regCV2plot,aes(obs,pred)) +
  geom_point() +
  stat_smooth(aes(obs, obs), 
              method = "lm", se = FALSE, size = 1, colour="#FA7800") + 
  stat_smooth(aes(obs, pred), 
              method = "lm", se = FALSE, size = 1, colour="#25CB10") +
  labs(title="Predicted sale price as a function of observed price",
       subtitle="Orange line represents a perfect prediction; Green line represents prediction",
       x="Sale Price",
       y="Predicted Price") +
  plotTheme() + theme(plot.title = element_text(size = 18, colour = "black")) 

## [1] 2066
## [1] 1584

#our best model was reg.cv2

# ---- Markdown: Introduction ----

## 
## <table style="text-align:center"><caption><strong>Summary Statistics</strong></caption>
## <tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">Statistic</td><td>N</td><td>Mean</td><td>St. Dev.</td><td>Min</td><td>Max</td></tr>
## <tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">SalePrice</td><td>2,066</td><td>405,476.400</td><td>199,741.700</td><td>12,500</td><td>1,000,000</td></tr>
## <tr><td style="text-align:left">Miami.dummy</td><td>2,066</td><td>0.953</td><td>0.212</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">LotSize</td><td>2,066</td><td>6,360.875</td><td>1,721.617</td><td>1,250</td><td>17,620</td></tr>
## <tr><td style="text-align:left">Age</td><td>2,066</td><td>70.954</td><td>18.186</td><td>-1</td><td>115</td></tr>
## <tr><td style="text-align:left">Stories</td><td>2,066</td><td>1.073</td><td>0.265</td><td>0</td><td>3</td></tr>
## <tr><td style="text-align:left">Bed</td><td>2,066</td><td>2.692</td><td>0.794</td><td>0</td><td>8</td></tr>
## <tr><td style="text-align:left">Bath</td><td>2,066</td><td>1.611</td><td>0.700</td><td>0</td><td>6</td></tr>
## <tr><td style="text-align:left">Pool</td><td>2,066</td><td>0.108</td><td>0.310</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">Fence</td><td>2,066</td><td>0.738</td><td>0.440</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">Patio</td><td>2,066</td><td>0.499</td><td>0.500</td><td>0</td><td>1</td></tr>
## <tr><td style="text-align:left">Shore1</td><td>2,066</td><td>7,047.549</td><td>5,248.614</td><td>88.597</td><td>26,528.540</td></tr>
## <tr><td style="text-align:left">MedRent</td><td>2,040</td><td>1,042.535</td><td>311.133</td><td>246.000</td><td>2,297.000</td></tr>
## <tr><td style="text-align:left">pctWhite</td><td>2,062</td><td>0.703</td><td>0.320</td><td>0.057</td><td>0.989</td></tr>
## <tr><td style="text-align:left">pctPoverty</td><td>2,062</td><td>0.217</td><td>0.108</td><td>0.052</td><td>0.556</td></tr>
## <tr><td style="text-align:left">Brownsville.MS</td><td>1,588</td><td>0.098</td><td>0.298</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">CitrusGrove.MS</td><td>1,588</td><td>0.115</td><td>0.319</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">JosedeDiego.MS</td><td>1,588</td><td>0.129</td><td>0.335</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">GeorgiaJA.MS</td><td>1,588</td><td>0.133</td><td>0.340</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">KinlochPk.MS</td><td>1,588</td><td>0.196</td><td>0.397</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">Madison.MS</td><td>1,588</td><td>0.001</td><td>0.035</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">Nautilus.MS</td><td>1,588</td><td>0.061</td><td>0.240</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">Shenandoah.MS</td><td>1,588</td><td>0.243</td><td>0.429</td><td>0.000</td><td>1.000</td></tr>
## <tr><td style="text-align:left">WestMiami.MS</td><td>1,588</td><td>0.024</td><td>0.153</td><td>0.000</td><td>1.000</td></tr>
## <tr><td colspan="6" style="border-bottom: 1px solid black"></td></tr></table>

# ---- Markdown: Method ----

# ---- Markdown: Results ----

# Is there a spatial correlation of errors?

# get index for training sample
inTrain <- caret::createDataPartition(
  y = miamiHomes.train$SalePrice, 
  p = .60, list = FALSE)
# split data into training and test
miami.training <- miamiHomes.train[inTrain,] 
miami.test     <- miamiHomes.train[-inTrain,]  

# Same dataset used for reg.cv2; GEOID is dropped here as well
# because it caused a factor-levels error
finaltraining <- lm(SalePrice ~ ., data = miami.training %>%
                      st_drop_geometry() %>%
                   dplyr::select(-GEOID, -ID, 
                                 -TotalPop, -MedHHInc, -toPredict))

miami.test <-
  miami.test %>%
  mutate(Regression = "Baseline Regression",
         SalePrice.Predict = predict(finaltraining, miami.test),
         SalePrice.Error = SalePrice.Predict - SalePrice,
         SalePrice.AbsError = abs(SalePrice.Predict - SalePrice),
         # Absolute percentage error, measured relative to the observed price
         SalePrice.APE = abs(SalePrice.Predict - SalePrice) / SalePrice)
## Warning: Problem with `mutate()` input `SalePrice.Predict`.
## ℹ prediction from a rank-deficient fit may be misleading
## ℹ Input `SalePrice.Predict` is `predict(finaltraining, miami.test)`.
## Warning in predict.lm(finaltraining, miami.test): prediction from a rank-
## deficient fit may be misleading
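The rank-deficiency warning goes with the `NA` coefficients in the regression summary ("not defined because of singularities"). One common cause, assumed here rather than confirmed for this dataset, is a full set of dummy variables that sum to 1 in every row and so duplicate the intercept. A minimal sketch on toy data (the `zoneA`, `zoneB`, `zoneC` columns are hypothetical):

```r
# Toy data: three school zones coded as dummies that sum to 1 in every row
set.seed(42)                                   # arbitrary seed
zone <- rep(c("A", "B", "C"), each = 10)
toy  <- data.frame(
  price = rnorm(30, mean = 300000, sd = 50000),
  zoneA = as.numeric(zone == "A"),
  zoneB = as.numeric(zone == "B"),
  zoneC = as.numeric(zone == "C")
)

# With an intercept, the third dummy is a linear combination of the others
fit_all <- lm(price ~ zoneA + zoneB + zoneC, data = toy)
aliased <- names(coef(fit_all))[is.na(coef(fit_all))]
aliased

# Leaving one dummy out as the reference category removes the singularity
fit_ref <- lm(price ~ zoneA + zoneB, data = toy)
anyNA(coef(fit_ref))
```

Checking `is.na(coef(fit))` on the real model is a quick way to list which predictors were dropped before trusting `predict()`.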
k_nearest_neighbors <- 5
# Spatial lag of sale price: start from the centroid of each home's geometry
coords <- st_coordinates(st_centroid(miamiHomesClean.sf))
## Warning in st_centroid.sf(miamiHomesClean.sf): st_centroid assumes attributes
## are constant over geometries of x
# k nearest neighbors
neighborList <- knn2nb(knearneigh(coords, k_nearest_neighbors))
## Warning in knearneigh(coords, k_nearest_neighbors): knearneigh: identical points
## found
spatialWeights <- nb2listw(neighborList, style="W")
miamiHomesClean.sf$lagPrice <- lag.listw(spatialWeights, miamiHomesClean.sf$SalePrice)

# Spatial lag of absolute errors, using only test rows with no missing values
miami.test.lag <- drop_na(miami.test)
coords.test <- st_coordinates(st_centroid(miami.test.lag))
## Warning in st_centroid.sf(miami.test.lag): st_centroid assumes attributes are
## constant over geometries of x
neighborList.test <- knn2nb(knearneigh(coords.test, k_nearest_neighbors))
## Warning in knearneigh(coords.test, k_nearest_neighbors): knearneigh: identical
## points found
spatialWeights.test <- nb2listw(neighborList.test, style="W")
miami.test.lag$lagPriceError <- lag.listw(spatialWeights.test, miami.test.lag$SalePrice.AbsError)

#this is breaking because of annotation_map & geom_sf miami base
ggplot(miamiHomesClean.sf, aes(x=lagPrice, y=SalePrice)) +
  geom_point(colour = "#FA7800") +
  geom_smooth(method = "lm", se = FALSE, colour = "#25CB10") +
  labs(title = "Price as a function of the spatial lag of price",
       caption = "Public Policy Analytics, Figure 6.6",
       x = "Spatial lag of price (Mean price of 5 nearest neighbors)",
       y = "Sale Price") +
  plotTheme()
## `geom_smooth()` using formula 'y ~ x'

ggplot(miami.test.lag, aes(x=lagPriceError, y=SalePrice.AbsError)) +
  geom_point(colour = "#FA7800") +
  geom_smooth(method = "lm", se = FALSE, colour = "#25CB10") +
  labs(title = "Error as a function of the spatial lag of error",
       caption = "",
       x = "Spatial lag of errors (Mean error of 5 nearest neighbors)",
       y = "Absolute error") +
  plotTheme()
## `geom_smooth()` using formula 'y ~ x'

#Moran's I Test
moranTest <- moran.mc(miami.test.lag$SalePrice.AbsError, 
                      spatialWeights.test, nsim = 999, na.action=na.exclude)

ggplot(as.data.frame(moranTest$res[c(1:999)]), aes(moranTest$res[c(1:999)])) +
  geom_histogram(binwidth = 0.01) +
  geom_vline(aes(xintercept = moranTest$statistic), colour = "#FA7800",size=1) +
  scale_x_continuous(limits = c(-1, 1)) +
  labs(title="Observed and permuted Moran's I",
       subtitle= "Observed Moran's I in orange",
       x="Moran's I",
       y="Count",
       caption="Public Policy Analytics, Figure 6.8") +
  plotTheme()
## Warning: Removed 2 rows containing missing values (geom_bar).
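As a rough sketch of what the permutation test does, the block below computes Moran's I by hand and compares it with 999 shuffles of the same values. The 1-D neighbor layout, the simulated values, and the helper `morans_i` are assumptions for illustration only; the pseudo p-value follows the usual rank-based formula and is not claimed to reproduce `moran.mc` exactly.

```r
# Toy 1-D layout: each point's neighbors are the points immediately left and right
n <- 20
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nb <- intersect(c(i - 1, i + 1), seq_len(n))
  W[i, nb] <- 1 / length(nb)           # row-standardized weights
}

# Moran's I: (n / sum of weights) * weighted cross-products / total variation
morans_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

set.seed(7)                             # arbitrary seed
x <- cumsum(rnorm(n))                   # a random walk: neighbors tend to be alike
observed_i <- morans_i(x, W)

# Permutation test: shuffle values across locations to break any spatial pattern
permuted_i <- replicate(999, morans_i(sample(x), W))
pseudo_p   <- (sum(permuted_i >= observed_i) + 1) / (999 + 1)

c(observed = observed_i, pseudo_p = pseudo_p)
```

If the observed statistic sits far to the right of the permuted distribution, as the orange line does in the histogram, the errors are more spatially clustered than random shuffling would produce.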

| GEOID | meanPrice | meanPrediction | meanMAE |
|:------------|---------:|----------:|-----------:|
| 12086003001 | 915000.0 | 268327.05 | 646672.946 |
| 12086005202 | 700000.0 | 445685.61 | 254314.392 |
| 12086004304 | 307000.0 | 525805.20 | 218805.197 |
| 12086003912 | 539000.0 | 577841.20 | 211835.504 |
| 12086003004 | 575000.0 | 437675.04 | 137324.962 |
| 12086003915 | 591785.7 | 688403.10 | 132921.677 |
| 12086006601 | 624993.1 | 522584.59 | 126780.120 |
| 12086005102 | 436571.4 | 316573.29 | 120398.032 |
| 12086002004 | 150000.0 | 268285.44 | 118285.438 |
| 12086004102 | 700000.0 | 584514.34 | 115485.658 |
| 12086005410 | 340000.0 | 451302.58 | 111302.580 |
| 12086006503 | 434000.0 | 459507.25 | 109272.858 |
| 12086003906 | 794950.0 | 771691.90 | 107937.653 |
| 12086006702 | 690461.5 | 603962.64 | 105641.596 |
| 12086002900 | 238000.0 | 336254.91 | 104279.592 |
| 12086006403 | 552071.4 | 512562.37 | 97561.292 |
| 12086002201 | 437045.5 | 444508.34 | 96206.119 |
| 12086005407 | 419100.0 | 426226.95 | 92830.740 |
| 12086003916 | 602333.3 | 690492.62 | 88159.283 |
| 12086006900 | 494965.0 | 515227.53 | 86630.053 |
| 12086003003 | 358750.0 | 442894.40 | 84144.398 |
| 12086002202 | 329363.9 | 317885.04 | 81962.040 |
| 12086004000 | 776714.3 | 751661.83 | 78755.202 |
| 12086006402 | 471170.6 | 466438.30 | 69228.772 |
| 12086005104 | 307500.0 | 355688.48 | 69206.464 |
| 12086005600 | 378323.5 | 380727.01 | 66111.009 |
| 12086006401 | 416208.3 | 456677.65 | 65606.356 |
| 12086001501 | 149823.5 | 96667.49 | 65155.251 |
| 12086002003 | 215557.1 | 251433.27 | 63769.424 |
| 12086005001 | 315233.3 | 360172.70 | 62695.229 |
| 12086001802 | 204153.8 | 219469.02 | 62282.077 |
| 12086002600 | 370760.0 | 332049.58 | 61184.751 |
| 12086005801 | 344769.0 | 349191.94 | 60101.193 |
| 12086004106 | 812925.0 | 814527.02 | 59134.433 |
| 12086005302 | 245600.0 | 303511.05 | 57911.052 |
| 12086006504 | 452777.8 | 506102.72 | 57820.995 |
| 12086002501 | 256950.0 | 252675.70 | 57534.074 |
| 12086004902 | 332882.4 | 342138.81 | 57330.224 |
| 12086006501 | 463807.7 | 479835.59 | 55738.458 |
| 12086001904 | 215546.2 | 213373.50 | 54032.128 |
| 12086001801 | 198980.0 | 196910.64 | 51976.306 |
| 12086001901 | 187181.8 | 205221.51 | 50874.559 |
| 12086005201 | 250000.0 | 300260.98 | 50260.979 |
| 12086005405 | 291500.0 | 317920.50 | 49823.456 |
| 12086002402 | 237066.7 | 275337.91 | 49122.809 |
| 12086001903 | 227722.2 | 237934.07 | 49075.425 |
| 12086002300 | 234197.4 | 213052.21 | 46524.769 |
| 12086005502 | 331000.0 | 332577.31 | 43788.613 |
| 12086005802 | 338025.7 | 335565.40 | 43297.792 |
| 12086005701 | 332344.4 | 315157.65 | 39424.029 |
| 12086005704 | 347100.0 | 333505.12 | 37294.808 |
| 12086005002 | 312454.5 | 340536.20 | 36677.606 |
| 12086002403 | 278757.1 | 315030.59 | 36273.451 |
| 12086005406 | 319500.0 | 286922.37 | 32577.630 |
| 12086490100 | 348500.0 | 345176.76 | 29316.631 |
| 12086005501 | 313000.0 | 292218.20 | 28602.688 |
| 12086005403 | 300000.0 | 290351.41 | 20285.762 |
| 12086002502 | 235625.0 | 216167.54 | 19457.458 |
| 12086002404 | 237833.3 | 227770.79 | 10062.541 |
| 12086003602 | 371000.0 | 364975.68 | 6024.317 |
| 12086003100 | 330000.0 | 324816.59 | 5183.408 |
| 12086005103 | 265000.0 | 261525.08 | 3474.923 |
| 12086001301 | 503615.8 | NaN | NaN |
| 12086001302 | 575275.0 | NaN | NaN |
| 12086001401 | 194180.0 | NaN | NaN |
| 12086001402 | 225000.0 | NaN | NaN |
| 12086002001 | 189940.0 | NaN | NaN |
| 12086002100 | 758181.8 | NaN | NaN |
| 12086006301 | 392666.7 | NaN | NaN |
| 12086006302 | 409836.8 | NaN | NaN |
| 12086006801 | 745872.7 | NaN | NaN |
| 12086006802 | 759500.0 | NaN | NaN |
| 12086007001 | 347500.0 | NaN | NaN |
| 12086007002 | 391250.0 | NaN | NaN |
| 12086007101 | 559285.7 | NaN | NaN |
| 12086007104 | 545000.0 | NaN | NaN |
| 12086007200 | 353516.7 | NaN | NaN |
| 12086007300 | 735517.6 | NaN | NaN |
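The tracts at the bottom of the table have a mean price but `NaN` for the mean prediction and mean error. One plausible explanation, an assumption rather than something the output confirms, is that every home in those tracts had a missing prediction, so a mean taken with `na.rm = TRUE` had nothing left to average. A small sketch with hypothetical tract labels shows the behavior:

```r
# Toy test-set errors keyed by tract; tract "C" has no usable predictions
errors <- data.frame(
  GEOID    = c("A", "A", "B", "C", "C"),
  AbsError = c(10000, 30000, 5000, NA, NA)
)

# Mean absolute error per tract, ignoring missing values
mae_by_tract <- tapply(errors$AbsError, errors$GEOID,
                       function(e) mean(e, na.rm = TRUE))
mae_by_tract

# Tracts with no non-missing errors, i.e. the NaN rows
names(mae_by_tract)[is.nan(mae_by_tract)]
```

Counting non-missing predictions per tract alongside the mean would make it clear which rows rest on few or no observations.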

## [1] 37283
## Linear Regression 
## 
## 403 samples
##  23 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (100 fold) 
## Summary of sample sizes: 399, 400, 398, 399, 398, 398, ... 
## Resampling results:
## 
##   RMSE      Rsquared   MAE     
##   130147.8  0.7933946  93060.55
## 
## Tuning parameter 'intercept' was held constant at a value of TRUE
## Linear Regression 
## 
## 414 samples
##  25 predictor
## 
## No pre-processing
## Resampling: Cross-Validated (100 fold) 
## Summary of sample sizes: 291, 290, 290, 290, 290, 290, ... 
## Resampling results:
## 
##   RMSE      Rsquared  MAE    
##   98912.42  0.709759  82492.1
## 
## Tuning parameter 'intercept' was held constant at a value of TRUE
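| intercept | RMSE | Rsquared | MAE | RMSESD | RsquaredSD | MAESD |
|:----------|---------:|----------:|---------:|---------:|-----------:|---------:|
| TRUE | 130147.8 | 0.7933946 | 93060.55 | 354594.9 | 0.2778949 | 210570.7 |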
kable(miamiRichLM$results)
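| intercept | RMSE | Rsquared | MAE | RMSESD | RsquaredSD | MAESD |
|:----------|---------:|---------:|--------:|---------:|-----------:|---------:|
| TRUE | 98912.42 | 0.709759 | 82492.1 | 52113.41 | 0.271611 | 39852.27 |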